Learning Large-Scale Poisson DAG Models based on OverDispersion Scoring

Authors

  • Gunwoong Park
  • Garvesh Raskutti
Abstract

In this paper, we address the question of identifiability and learning algorithms for large-scale Poisson Directed Acyclic Graphical (DAG) models. We define general Poisson DAG models as models where each node is a Poisson random variable with rate parameter depending on the values of the parents in the underlying DAG. First, we prove that Poisson DAG models are identifiable from observational data, and present a polynomial-time algorithm that learns the Poisson DAG model under suitable regularity conditions. The main idea behind our algorithm is overdispersion: variables that are conditionally Poisson are overdispersed relative to variables that are marginally Poisson. Our algorithm exploits overdispersion along with methods for learning sparse Poisson undirected graphical models for faster computation. We provide both theoretical guarantees and simulation results for both small and large-scale DAGs.
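The overdispersion idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' algorithm; it only shows the variance gap that such a method could exploit. It assumes a hypothetical two-node DAG X1 -> X2 with a log-linear rate for X2, and the function names and parameter values are chosen purely for illustration. By the law of total variance, Var(X2) = E[rate] + Var(rate), so a node with a parent has variance above its mean, while the root node has variance equal to its mean.

```python
import numpy as np


def overdispersion_score(x):
    """Sample variance minus sample mean; close to 0 for a Poisson variable."""
    x = np.asarray(x, dtype=float)
    return x.var(ddof=1) - x.mean()


def simulate_two_node_dag(n, rate_root=2.0, intercept=0.2, weight=0.3, seed=0):
    """Draw n samples from a hypothetical DAG X1 -> X2.

    X1 is marginally Poisson(rate_root). X2 is Poisson given X1, with rate
    exp(intercept + weight * X1). The log link and all parameter values are
    assumptions made for this illustration only.
    """
    rng = np.random.default_rng(seed)
    x1 = rng.poisson(rate_root, size=n)
    x2 = rng.poisson(np.exp(intercept + weight * x1))
    return x1, x2


x1, x2 = simulate_two_node_dag(n=20000)
score_x1 = overdispersion_score(x1)  # root node: variance is close to the mean
score_x2 = overdispersion_score(x2)  # child node: variance exceeds the mean

print(f"overdispersion of X1 (root):  {score_x1:.3f}")
print(f"overdispersion of X2 (child): {score_x2:.3f}")
```

In this toy setting the node with the smallest score is the natural candidate for the first position in a causal ordering, which matches the intuition stated in the abstract.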


Related resources

Learning Quadratic Variance Function (QVF) DAG models via OverDispersion Scoring (ODS)

Learning DAG or Bayesian network models is an important problem in multi-variate causal inference. However, a number of challenges arise in learning large-scale DAG models, including model identifiability and computational complexity, since the space of directed graphs is huge. In this paper, we address these issues in a number of steps for a broad class of DAG models where the noise or variance...


Modelling count data with overdispersion and spatial effects

In this paper we consider regression models for count data allowing for overdispersion in a Bayesian framework. We account for unobserved heterogeneity in the data in two ways. On the one hand, we consider more flexible models than a common Poisson model allowing for overdispersion in different ways. In particular, the negative binomial and the generalized Poisson distribution are addressed whe...


Enabling Large-Scale Bayesian Network Learning by Preserving Intercluster Directionality

We propose a recursive clustering and order restriction (R-CORE) method for learning large-scale Bayesian networks. The proposed method considers a reduced search space for directed acyclic graph (DAG) structures in scoring-based Bayesian network learning. The candidate DAG structures are restricted by clustering variables and determining the intercluster directionality. The proposed method con...


Score tests for heterogeneity and overdispersion in zero-inflated Poisson and binomial regression models

Hall (2000) has described zero-inflated Poisson and binomial regression models that include random effects to account for excess zeros and additional sources of heterogeneity in the data. The authors of the present paper propose a general score test for the null hypothesis that variance components associated with these random effects are zero. For a zero-inflated Poisson model with random inter...


Disease Mapping and Regression with Count Data in the Presence of Overdispersion and Spatial Autocorrelation: A Bayesian Model Averaging Approach

This paper applies the generalised linear model for modelling geographical variation to esophageal cancer incidence data in the Caspian region of Iran. The data have a complex and hierarchical structure that makes them suitable for hierarchical analysis using Bayesian techniques, but with care required to deal with problems arising from counts of events observed in small geographical areas when...




Publication date: 2015